<?xml version="1.0" encoding="UTF-8"?><xml><records><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Vassiliki A. Koutsonikola</style></author><author><style face="normal" font="default" size="100%">Petridou, Sophia G.</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author><author><style face="normal" font="default" size="100%">Papadimitriou, Georgios I.</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">A new approach to web users clustering and validation: a divergence-based scheme</style></title><secondary-title><style face="normal" font="default" size="100%">IJWIS</style></secondary-title></titles><keywords><keyword><style  face="normal" font="default" size="100%">Cluster analysis</style></keyword><keyword><style  face="normal" font="default" size="100%">Internet Data mining</style></keyword><keyword><style  face="normal" font="default" size="100%">User studies</style></keyword></keywords><dates><year><style  face="normal" font="default" size="100%">2009</style></year></dates><number><style face="normal" font="default" size="100%">3</style></number><volume><style face="normal" font="default" size="100%">5</style></volume><pages><style face="normal" font="default" size="100%">348-371</style></pages><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">&lt;p&gt;Purpose – Web users’ clustering is an important mining task since it contributes in identifying usage patterns, a beneficial task for a wide range of applications that rely on the web. 
The purpose of this paper is to examine the usage of Kullback-Leibler (KL) divergence, an information theoretic distance, as an alternative option for measuring distances in web users clustering. Design/methodology/approach – KL-divergence is compared with other well-known distance measures and clustering results are evaluated using a criterion function, validity indices, and graphical representations. Furthermore, the impact of noise (i.e. occasional or mistaken page visits) is evaluated, since it is imperative to assess whether a clustering process exhibits tolerance in noisy environments such as the web. Findings – The proposed KL clustering approach is of similar performance when compared with other distance measures under both synthetic and real data workloads. Moreover, imposing extra noise on real data, the approach shows minimum deterioration among most of the other conventional distance measures. Practical implications – The experimental results show that a probabilistic measure such as KL-divergence has proven to be quite efficient in noisy environments and thus constitutes a good alternative for the web users clustering problem. Originality/value – This work is inspired by the usage of divergence in clustering of biological data and it is introduced by the authors in the area of web clustering. According to the experimental results presented in this paper, KL-divergence can be considered as a good alternative for measuring distances in noisy environments such as the web.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Vassiliki A. Koutsonikola</style></author><author><style face="normal" font="default" size="100%">Petridou, Sophia G.</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author><author><style face="normal" font="default" size="100%">Hacid, Hakim</style></author><author><style face="normal" font="default" size="100%">Benatallah, Boualem</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Bailey, James</style></author><author><style face="normal" font="default" size="100%">Maier, David</style></author><author><style face="normal" font="default" size="100%">Schewe, Klaus-Dieter</style></author><author><style face="normal" font="default" size="100%">Thalheim, Bernhard</style></author><author><style face="normal" font="default" size="100%">Wang, Xiaoyang Sean</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">Correlating Time-Related Data Sources with Co-clustering</style></title><secondary-title><style face="normal" font="default" size="100%">WISE</style></secondary-title><tertiary-title><style face="normal" font="default" size="100%">Lecture Notes in Computer Science</style></tertiary-title></titles><dates><year><style  face="normal" font="default" size="100%">2008</style></year></dates><publisher><style face="normal" font="default" size="100%">Springer</style></publisher><volume><style face="normal" font="default" size="100%">5175</style></volume><pages><style face="normal" font="default" size="100%">264-279</style></pages><isbn><style face="normal" font="default" size="100%">978-3-540-85480-7</style></isbn><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style 
face="normal" font="default" size="100%">&lt;p&gt;A huge amount of data is circulated and collected every day on a regular time basis. Given a pair of such datasets, it might be possible to reveal hidden dependencies between them since the presence of the one dataset elements may influence the elements of the other dataset and vice versa. Furthermore, the impact of these relations may last during a period instead of the time point of their co-occurrence. Mining such relations under those assumptions is a challenging problem. In this paper, we study two time-related datasets whose elements are bilaterally affected over time. We employ a co-clustering approach to identify groups of similar elements on the basis of two distinct criteria: the direction and duration of their impact. The proposed approach is evaluated using time-related news and stock’s market real datasets.&lt;/p&gt;
</style></abstract></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>17</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Petridou, Sophia G.</style></author><author><style face="normal" font="default" size="100%">Vassiliki A. Koutsonikola</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author><author><style face="normal" font="default" size="100%">Papadimitriou, Georgios I.</style></author></authors></contributors><titles><title><style face="normal" font="default" size="100%">Time-Aware Web Users’ Clustering</style></title><secondary-title><style face="normal" font="default" size="100%">IEEE Trans. Knowl. Data Eng.</style></secondary-title></titles><dates><year><style  face="normal" font="default" size="100%">2008</style></year></dates><number><style face="normal" font="default" size="100%">5</style></number><volume><style face="normal" font="default" size="100%">20</style></volume><pages><style face="normal" font="default" size="100%">653-667</style></pages><language><style face="normal" font="default" size="100%">eng</style></language></record><record><source-app name="Biblio" version="7.x">Drupal-Biblio</source-app><ref-type>47</ref-type><contributors><authors><author><style face="normal" font="default" size="100%">Petridou, Sophia G.</style></author><author><style face="normal" font="default" size="100%">Vassiliki A. 
Koutsonikola</style></author><author><style face="normal" font="default" size="100%">Athena Vakali</style></author><author><style face="normal" font="default" size="100%">Papadimitriou, Georgios I.</style></author></authors><secondary-authors><author><style face="normal" font="default" size="100%">Gavrilova, Marina L.</style></author><author><style face="normal" font="default" size="100%">Gervasi, Osvaldo</style></author><author><style face="normal" font="default" size="100%">Kumar, Vipin</style></author><author><style face="normal" font="default" size="100%">Tan, Chih Jeng Kenneth</style></author><author><style face="normal" font="default" size="100%">Taniar, David</style></author><author><style face="normal" font="default" size="100%">Laganà, Antonio</style></author><author><style face="normal" font="default" size="100%">Mun, Youngsong</style></author><author><style face="normal" font="default" size="100%">Choo, Hyunseung</style></author></secondary-authors></contributors><titles><title><style face="normal" font="default" size="100%">A Divergence-Oriented Approach for Web Users Clustering</style></title><secondary-title><style face="normal" font="default" size="100%">ICCSA (2)</style></secondary-title><tertiary-title><style face="normal" font="default" size="100%">Lecture Notes in Computer Science</style></tertiary-title></titles><dates><year><style  face="normal" font="default" size="100%">2006</style></year></dates><publisher><style face="normal" font="default" size="100%">Springer</style></publisher><volume><style face="normal" font="default" size="100%">3981</style></volume><pages><style face="normal" font="default" size="100%">1229-1238</style></pages><isbn><style face="normal" font="default" size="100%">3-540-34072-6</style></isbn><language><style face="normal" font="default" size="100%">eng</style></language><abstract><style face="normal" font="default" size="100%">Clustering web users based on their access patterns is a quite significant task in Web Usage 
Mining. Further to clustering it is important to evaluate the resulted clusters in order to choose the best clustering for a particular framework. This paper examines the usage of Kullback-Leibler divergence, an information theoretic distance, in conjunction with the k-means clustering algorithm. It compares KL-divergence with other well known distance measures (Euclidean, Standardized Euclidean and Manhattan) and evaluates clustering results using both objective function’s value and Davies-Bouldin index. Since it is imperative to assess whether the results of a clustering process are susceptible to noise, especially in noisy environments such as Web environment, our approach takes the impact of noise into account. The clusters obtained with KL approach seem to be superior to those obtained with the other distance measures in case our data have been corrupted by noise.</style></abstract></record></records></xml>